A New DBMS Architecture for DB-IR Integration

Author

  • Kyu-Young Whang
Abstract

Nowadays, as there is an increasing need to integrate the DBMS (for structured data) with Information Retrieval (IR) features (for unstructured data), DB-IR integration has become one of the major challenges in the database area[1,2]. Extensible architectures provided by commercial ORDBMS vendors can be used for DB-IR integration. Here, extensions are implemented using a high-level (typically, SQL-level) interface. We call this architecture loose-coupling. The advantage of loose-coupling is that it is easy to implement. However, it is not preferable for implementing new data types and operations in large databases when high performance is required. In this talk, we present a new DBMS architecture applicable to DB-IR integration, which we call tight-coupling. In tight-coupling, new data types and operations are integrated into the core of the DBMS engine in the extensible type layer. Thus, they are incorporated as "first-class citizens"[1] within the DBMS architecture and are supported in a consistent manner with high performance. This tight-coupling architecture is being used to incorporate IR features and spatial database features into the Odysseus ORDBMS, which has been under development at KAIST/AITrc for over 16 years[3]. In this talk, we introduce Odysseus and explain its tightly-coupled IR features (U.S. patented in 2002[2]). Then, we demonstrate the excellence of tight-coupling by showing benchmark results. We have built a web search engine that is capable of managing 20 to 100 million web pages in a non-parallel configuration using Odysseus. This engine has been successfully tested in many commercial environments. In a parallel configuration, it is capable of managing billions of web pages. This work won the Best Demonstration Award from the IEEE ICDE conference held in Tokyo, Japan in April 2005[3].
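The contrast between the two architectures can be made concrete with a toy sketch. This is not Odysseus code and does not reflect its actual interfaces: the table contents, the index layout, and every function name below are assumptions chosen for illustration. In the loosely-coupled version the text search sits behind a separate high-level call and the structured filter is applied afterwards, row by row. In the tightly-coupled version a single engine routine sees both predicates and intersects two index posting lists directly.

```python
from collections import defaultdict

# Hypothetical toy table: doc_id -> (year, text).
DOCS = {
    1: (2004, "tight coupling of database and information retrieval"),
    2: (2006, "spatial database indexing"),
    3: (2006, "information retrieval inside the database engine"),
}


def build_indexes(docs):
    """Build a keyword inverted index and a year index (sorted doc_id lists)."""
    text_index, year_index = defaultdict(set), defaultdict(set)
    for doc_id, (year, text) in docs.items():
        year_index[year].add(doc_id)
        for word in text.split():
            text_index[word].add(doc_id)
    return ({w: sorted(ids) for w, ids in text_index.items()},
            {y: sorted(ids) for y, ids in year_index.items()})


TEXT_INDEX, YEAR_INDEX = build_indexes(DOCS)


def external_text_search(keyword):
    """Stand-in for an IR extension reached through a high-level interface."""
    return TEXT_INDEX.get(keyword, [])


def loose_coupling_query(keyword, year):
    """Loose coupling (illustrative): fetch all IR hits, then filter rows."""
    hits = external_text_search(keyword)
    return [doc_id for doc_id in hits if DOCS[doc_id][0] == year]


def intersect_sorted(a, b):
    """Merge-intersect two sorted lists of doc_ids."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def tight_coupling_query(keyword, year):
    """Tight coupling (illustrative): one routine combines both indexes."""
    return intersect_sorted(TEXT_INDEX.get(keyword, []),
                            YEAR_INDEX.get(year, []))
```

Both functions return the same answer for a query such as `("retrieval", 2006)`. The difference the sketch tries to show is where the work happens: the loosely-coupled path materializes every text hit before the structured predicate is applied, whereas the tightly-coupled path lets the engine combine the two predicates at the index level.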


Similar resources

The QUIQ Engine: A Hybrid IR DB System

For applications that involve rapidly changing textual data and also require traditional DBMS capabilities, current systems are unsatisfactory. In this paper, we describe a hybrid IR-DB system that serves as the basis for the QUIQConnect product, a collaborative customer support application. We present the novel query paradigm and system architecture, along with performance results.


ODYS: A Massively-Parallel Search Engine Using a DB-IR Tightly-Integrated Parallel DBMS

Recently, parallel search engines have been implemented based on scalable distributed file systems such as Google File System. However, we claim that building a massively-parallel search engine using a parallel DBMS can be an attractive alternative since it supports a higher-level (i.e., SQL-level) interface than that of a distributed file system for easy and less error-prone application develo...


The System Architecture and the Transaction Concept of the SPIDER Information Retrieval System

A relational database (DB) management system is extended by the SPIDER Information Retrieval (IR) system to provide IR operations on data stored in the DB system. The IR data stored in the IR system is obtained by analyzing the objects in the database. This derived data is needed for advanced IR operations such as relevance ranking and relevance feedback. It is kept consistent with the DB data ...


IRO-DB: An object-oriented approach towards federated and interoperable DBMS

Today's application scenarios need more and more access to information stored and distributed among multiple database management systems which have various underlying data models and which model even the same real world aspects differently with respect to structure and granularity. Therefore, a system is needed which addresses these problems, providing the means to integrate heterogeneous data s...


OMS Java: Lessons Learned from Building a Multi-Tier Object Management Framework

We present the object-oriented multi-tier application framework OMS Java which is independent of the underlying database management system (DBMS). We detail the storage management component and sketch which part of the framework has to be extended when introducing a new DBMS. We compare versions of OMS Java using the persistent storage engine ObjectStore PSE Pro for Java, the object-oriented DB...



Publication date: 2007